Singular Value Decomposition (SVD), along with its related variation known as
Principal Component Analysis, is a powerful technique for data analysis in
linear algebra which has found many applications in various fields such as
Signal Processing, Statistical Analysis, Biomedical Engineering, Genetics Analysis,
Mathematical and Statistical Models, Graph Theory, Psychology, etc. In this
paper, we discuss an extension of SVD for both the qualitative detection and
quantitative determination of nonlinearity in a time series. SVD is performed
on the embedding matrix created from the data series. The conventional SVD can
determine the form of linear relationships among data vectors. For the proposed
method, the embedding matrix is augmented by nonlinear columns derived from
the usual ones. If the SVD of the augmented matrix gives zero singular values,
there is a linear relationship among its columns, and in that case we can exactly
determine the nonlinearity present in the data. The paper also demonstrates an
application of nonlinear SVD to cryptanalysis where the encrypted signal is
generated by a nonlinear transformation. Nonlinear methods are useful in
cryptography as the signals to be decrypted are often generated by nonlinear
transformations. We have included examples of maps (Logistic map and Henon map)
and flows (Van der Pol oscillator and Duffing oscillator) to illustrate the method
of nonlinear SVD to identify parameters. The paper presents the recovery of
parameters in the following scenarios: (i) data generated by maps and flows,
(ii) comparison of the method for both noisy and noise-free data, and
(iii) surrogate data analysis for both the noisy and noise-free cases.



I. INTRODUCTION 



Historically SVD has been used for finding the dimension of a linear system as it gives
a statistically independent set of variables which could span the state space. Standard SVD
based methods known as Singular Spectrum Analysis were used for detecting nonlinearity
in a qualitative manner. An abrupt decrease in the profile of the singular spectrum
is an indication of lower dimensional determinism. But this method fails to distinguish
between chaotic data and its surrogates [?, 4]. The failure of this method for some well
known chaotic processes, and when the data is corrupted with noise, is also reported [?].
Bhattacharya and Kanjilal proposed a method of quadratic scaling of singular values in order
to detect the determinism in time series [3]. The decreasing singular values were weighted
more to highlight the deterministic and stochastic features. This method could qualitatively
distinguish between the data series and its surrogates.

The Grassberger-Procaccia (G-P) algorithm was used to show that a finite correlation
dimension of an irregular time series is an indication of underlying deterministic nonlinearity
[8]. The G-P algorithm has a few drawbacks: it fails when the data is noisy and it is unable
to distinguish stochastic processes with power-law power spectra from chaos [?]. There
are various methods to detect the underlying determinism in a time series. One uses the
distribution of correlation coefficients as a qualitative method to distinguish chaos from noise,
because the spectrum is flat for noise but gradually decays for chaos [13]. But this method
fails to distinguish between correlated noise and chaos [14]. Similarly in Statistics, the
distribution of the sample autocorrelation function (ACF) is used as an important tool to assess
the degree of dependence in data in order to select a model [?]. A constant sample ACF,
which takes zero values for all the lags (delays), is an indication of the data being independent and
identically distributed (iid) noise. If the sample ACF spectrum shows decay or oscillations,
appropriate statistical parametric models could be used to model the data. There are
developments in the field of both linear and nonlinear versions of Autoregressive (AR) and
Autoregressive Moving Average (ARMA) models along with many robust algorithms, e.g.
the Fast Orthogonal Search method which can obtain correct model parameters irrespective
of the model selection [15].

If one is interested in detecting the existence of nonlinearity but not in determining
the underlying model, A. Porta et al. suggest an alternate method [16], [14]. Their method
consists of using the Takens embedding of the data in a higher dimensional phase space and then
subdividing the phase space into non-overlapping hypercubes. Prediction is based on the
behavior of the median member of each hypercube. They find that an error function dips
much further with actual data than with its surrogates.

Linear and nonlinear AR and ARMA models are efficiently used for determining the
nonlinearity, i.e. to find the parameters of the appropriate models using optimal parameter
search (OPS) algorithms [3] and through various least square techniques: Least Square [?],
Total Least Square [20] and Minimizing the Hypersurface Distance [21]. In this paper, we
discuss a nonlinear extension of SVD to detect and exactly determine the nonlinearity, with
an application to cryptography. The proposed method of nonlinear SVD for maps is similar
to the nonlinear AR and ARMA regression proposed by Lu, S. et al. and Marmarelis [22].
But the method is not limited to polynomial regression. Any deterministic nonlinearity
present in the data could possibly be recovered. The nonlinear SVD method could be considered
as functional regression in an extended phase space.

Chaotic cryptanalysis is an emerging field of chaotic cryptography which deals with the
breaking of secret codes without any access to the super keys or parameters of the system
[23, ?, 25]. It often deals with the problem of system identification from the encrypted
data, which could be noisy and incomplete. Consider two parties communicating across a
private channel. The aim of cryptanalysis is to decrypt the message. The signal sent across
the channel looks random but it is generated by a deterministic dynamical system. Since
the intended recipient has some information about the system parameters (a key) she can
retrieve the information from the encrypted signal. All that the cryptanalyst knows about
the signal is that it must have been generated by a deterministic dynamical system. He has
no clue about the parameters or the dimension of the system. Here we propose a new
cryptanalysis tool based on SVD to find information about a system from the encrypted
signal. Nonlinear SVD and time delayed embedding, together with the method of finding
derivatives from data, can be used to identify the nonlinearity of the system from the
time series.

The paper is organized as follows. Section I is the introduction. Section II briefly reviews
the conventional SVD technique. Section III describes the method of nonlinear SVD and
Section IV discusses the applicability of the method to data series. A numerical example of
the retrieval of the Logistic map parameter from data is discussed in Section V. Section VI gives the
comparison of nonlinear SVD and conventional SVD analysis of the data and its surrogates.
Section VII discusses the extension of the method to higher order maps. Section VIII shows
the numerical results in the presence of two types of noise: uniform noise and Gaussian
noise. Section IX is the extension of the method to flows, and the recovery of the nonlinearity
from chaotic data is explained in Section X. Section XI is a discussion on linear and nonlinear
models. Section XII is a discussion on cryptanalysis and the conclusions are given in Section XIII.

II. SINGULAR VALUE DECOMPOSITION



Singular Value Decomposition can be considered as a generalization of the spectral
decomposition of square matrices to the analysis of rectangular matrices. SVD decomposes a
rectangular matrix into three simple matrices: two orthogonal matrices and one diagonal
matrix [19]. In general, the SVD theorem can be stated as follows: any $m \times n$ matrix $A$,
with $m > n$, can be factored into three matrices: $U$ (column orthogonal, $m \times n$ matrix), $W$
(diagonal, $n \times n$ matrix) and $V$ (orthogonal, $n \times n$ matrix). When $A$ is real, $A = U W V^T$
(where $V^T$ is the transpose of $V$). For complex matrices, $W$ remains real but $U$ and $V$
become unitary. The diagonal elements of the $W$ matrix are known as the singular values of $A$.

This decomposition is a technique that works well with matrices that are either singular
or else numerically very close to singular. SVD is also used to calculate pseudo-inverses
when the natural inverse of the matrix does not exist [27]. SVD and pseudo-inverses are
generally used in statistics for solving least square problems. Data compression using SVD
is one of the standard applications in image processing [19], [28].
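The factorization and the pseudo-inverse described above can be sketched numerically as follows. This is a minimal illustration, assuming NumPy; the $5 \times 3$ matrix `A`, the right-hand side `b` and the tolerance rule are hypothetical choices made only for the example and do not come from the paper.

```python
import numpy as np

# Hypothetical 5 x 3 example matrix (m > n), chosen only for illustration.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 10.0],
              [1.0, 0.0, 1.0],
              [2.0, 1.0, 0.0]])

# Thin SVD: U is m x n with orthonormal columns, w holds the singular values
# (the diagonal of W), and Vt is the transpose of the n x n orthogonal matrix V.
U, w, Vt = np.linalg.svd(A, full_matrices=False)
A_rebuilt = U @ np.diag(w) @ Vt

# Pseudo-inverse from the SVD: A^+ = V W^{-1} U^T, where only singular values
# above a small tolerance are inverted (the tolerance rule is an assumption).
tol = max(A.shape) * np.finfo(float).eps * w.max()
w_inv = np.array([1.0 / s if s > tol else 0.0 for s in w])
A_pinv = Vt.T @ np.diag(w_inv) @ U.T

# Least-squares solution of A x = b through the pseudo-inverse.
b = np.array([1.0, 0.0, 2.0, 1.0, 3.0])
x_ls = A_pinv @ b
```

Zeroing the reciprocal of tiny singular values is what lets the same code handle matrices that are singular or numerically close to singular.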



III. METHOD OF NONLINEAR SINGULAR VALUE DECOMPOSITION 

Given a time series generated by any system, $\{X\} = \{X_1, X_2, X_3, \ldots, X_N\}$, the aim of
nonlinear system identification is (i) to detect if there is nonlinearity and (ii) to exactly
determine the equation which generated the data. The current study is based on the data
generated by nonlinear maps and flows. We begin by using the standard Takens embedding:
a method of reconstruction of the state space with time delayed data segments known
as embedding vectors [29]. The embedding matrix $E$ is created from the time delayed
vectors as follows. A typical embedding vector is the $m$ dimensional embedding vector
generated from the given time series.

$$Y_1 = (X_1\ X_2\ \ldots\ X_m)^T,$$

$$Y_2 = (X_2\ X_3\ \ldots\ X_{m+1})^T,$$

$$Y_i = (X_i\ X_{i+1}\ \ldots\ X_{m+i-1})^T.$$

Note that $(X_1\ X_2\ \ldots\ X_m)^T$ is a column vector and denotes the transpose of $(X_1\ X_2\ \ldots\ X_m)$.
The collection $\{Y_i\}$ is the time delay embedding of the given data. Let the embedding matrix
$E$ be created from $k$ embedding vectors as follows.

$$E = \begin{pmatrix} Y_1^T \\ Y_2^T \\ \vdots \\ Y_k^T \end{pmatrix}
    = \begin{pmatrix}
        X_1 & X_2 & \ldots & X_m \\
        X_2 & X_3 & \ldots & X_{m+1} \\
        \vdots & \vdots & & \vdots \\
        X_k & X_{k+1} & \ldots & X_{m+k-1}
      \end{pmatrix}. \tag{1}$$



For nonlinear SVD, we extend the embedding matrix $E$ by adding nonlinear columns.
Let $F$ be the extended embedding matrix.

$$F = [E : f_1\ f_2\ \ldots\ f_l]. \tag{2}$$

The last $l$ columns of the $F$ matrix, $f_1, f_2, \ldots, f_l$, are functions of the columns of the $E$ matrix,
$\{E^{<1>}, E^{<2>}, \ldots, E^{<m>}\}$, where $E^{<i>}$ denotes the $i$-th column of $E$. In general, a non-linear
column refers to the square, cube, or any other higher power of a column, the product of two or
more columns, or any other kind of non-linearity such as exponentiation and trigonometric
functions of the column. If there is a non-linear relationship between the time delayed
vectors, it can be interpreted as a linear relationship between the time delayed vectors and
the corresponding nonlinear columns. The dimension of the $F$ matrix is $k \times p$ where $p = (m + l)$.
This extended embedding matrix can be considered as a higher dimensional linear system.
The singular value decomposition can find the linear relation between the embedding vectors
and the corresponding nonlinear columns, thus recovering the nonlinear relationship inherent in
the data. SVD is performed on $F$ to get

$$F = U W V^T. \tag{3}$$

Therefore,

$$F V = U W. \tag{4}$$
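Expanding Eq. 4,

$$F \begin{pmatrix} V^{<1>} & V^{<2>} & \ldots & V^{<p>} \end{pmatrix}
  = \begin{pmatrix} U^{<1>} & U^{<2>} & \ldots & U^{<p>} \end{pmatrix}
    \begin{pmatrix}
      W_{1,1} & 0 & \ldots & 0 \\
      0 & W_{2,2} & \ldots & 0 \\
      \vdots & & \ddots & \vdots \\
      0 & 0 & \ldots & W_{p,p}
    \end{pmatrix}. \tag{5}$$

Using partitions of $V$ and $W$ and expanding along the last column of $V$ and $W$ we get

$$\begin{pmatrix} F^{<1>} & F^{<2>} & \ldots & F^{<p>} \end{pmatrix}
  \begin{pmatrix} V_{1,p} \\ V_{2,p} \\ \vdots \\ V_{p,p} \end{pmatrix}
  = \begin{pmatrix} U^{<1>} & U^{<2>} & \ldots & U^{<p>} \end{pmatrix}
    \begin{pmatrix} 0 \\ 0 \\ \vdots \\ W_{p,p} \end{pmatrix}, \tag{6}$$

$$V_{1,p} F^{<1>} + V_{2,p} F^{<2>} + \ldots + V_{p,p} F^{<p>}
  = 0\,U^{<1>} + 0\,U^{<2>} + \ldots + W_{p,p} U^{<p>}. \tag{7}$$

(Note that $U^{<i>}$ stands for the $i$-th column of the $U$ matrix and $W_{i,j}$ is the element on the
$i$-th row and $j$-th column of the $W$ matrix.) If the $p$-th singular value of $W$ is zero,

$$W_{p,p} = 0,$$

then

$$V_{1,p} F^{<1>} + V_{2,p} F^{<2>} + \ldots + V_{p,p} F^{<p>} = 0. \tag{8}$$

Or, in another notation,

$$\sum_{i=1}^{p} V_{i,p} F^{<i>} = 0. \tag{9}$$

Eq. 8 and 9 can be seen as a statement that the columns of $F$, including those formed by
the nonlinear functions $\{f_1, f_2, \ldots, f_l\}$, are linearly dependent. This equation is true
for all rows of $F$, and hence by exploiting this relation the dependence between the linear
and nonlinear columns of the $F$ matrix can be recovered.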
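The construction of $E$ and $F$ and the relation in Eq. 7 can be sketched in code as follows. This is a minimal sketch, assuming NumPy; the helper names `delay_embed`, `extend` and `last_relation`, the random test series and the two nonlinear columns are hypothetical and serve only to check that $F V^{<p>}$ has norm $W_{p,p}$ for any $F$.

```python
import numpy as np

def delay_embed(x, m, k):
    """Return the k x m embedding matrix whose i-th row is (x[i], ..., x[i+m-1])."""
    x = np.asarray(x, dtype=float)
    if len(x) < k + m - 1:
        raise ValueError("series too short for k rows of dimension m")
    return np.array([x[i:i + m] for i in range(k)])

def extend(E, funcs):
    """Build F = [E : f_1 ... f_l], with one extra column per function of E."""
    return np.column_stack([E] + [f(E) for f in funcs])

def last_relation(F):
    """Return W_pp and the last column of V (the coefficients in Eq. 7)."""
    _, w, Vt = np.linalg.svd(F, full_matrices=False)
    return w[-1], Vt[-1]

# Hypothetical test series: uniform random numbers, used only to check Eq. 7.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=30)

E = delay_embed(x, m=3, k=20)
F = extend(E, [lambda M: M[:, 0] ** 2,          # square of the first column
               lambda M: M[:, 0] * M[:, 1]])    # product of two columns
w_pp, v_p = last_relation(F)

# Eq. 7: F V^{<p>} = W_pp U^{<p>}, and U^{<p>} has unit length,
# so the norm of F V^{<p>} equals W_pp.
residual = np.linalg.norm(F @ v_p)
```

When $W_{p,p}$ vanishes the residual vanishes too, and `v_p` then holds the coefficients of an exact linear relation among the columns of $F$.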



IV. NONLINEAR SVD OF DATA 

We begin with our assumption that the underlying equation is a function of the delay
vectors. If the data is noise free and the conventional SVD of the data does not result in
a small enough singular value, then we try different $F$'s of the form $F = [E : f_1\ f_2\ \ldots\ f_l]$
as shown in Section III. We keep trying different $f_j$'s till the nonlinear SVD gives at least
one nearly zero singular value. The next section shows a numerical example of data generated
by the Logistic map. We made a simple guess for the nonlinear function and it worked for that
case. If it did not, we would have tried other functions.

But when the data is noisy, the singular value $W_{p,p} \neq 0$ even if we get the right $F$ matrix
of nonlinear functions. Hence we try different nonlinear functions and choose the $F$ that
gives the lowest value of $W_{p,p}$. Ideally we want $W_{p,p}/W_{1,1}$ (the ratio of the $p$-th singular value
to the first singular value) to be below some preset criterion. As the noise level in the data increases,
chances are higher that the method of nonlinear SVD fails. Therefore our confidence in the
estimated model equation goes down with the increase in noise.
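The trial-and-selection procedure above can be sketched as a loop over candidate nonlinear columns that keeps the one with the smallest ratio $W_{p,p}/W_{1,1}$. This is a minimal sketch, assuming NumPy and noise-free Logistic data; the candidate set (square, cube, sine) and the threshold `criterion` are hypothetical choices for illustration only.

```python
import numpy as np

def logistic_series(a, x0, n):
    """Iterate X_{n+1} = a X_n (1 - X_n) and return n samples."""
    x = np.empty(n)
    x[0] = x0
    for i in range(n - 1):
        x[i + 1] = a * x[i] * (1.0 - x[i])
    return x

def singular_ratio(F):
    """Ratio of the smallest to the largest singular value of F."""
    w = np.linalg.svd(F, compute_uv=False)
    return w[-1] / w[0]

x = logistic_series(4.0, 0.02, 42)
# Rows (X_n, X_{n+1}, X_{n+2}) of the embedding matrix E.
E = np.column_stack([x[0:40], x[1:41], x[2:42]])

# Hypothetical candidate nonlinear columns, all built from the first column.
candidates = {
    "square": E[:, 0] ** 2,
    "cube": E[:, 0] ** 3,
    "sine": np.sin(E[:, 0]),
}
ratios = {name: singular_ratio(np.column_stack([E, col]))
          for name, col in candidates.items()}

best = min(ratios, key=ratios.get)   # candidate with the lowest W_pp / W_11
criterion = 1e-10                    # hypothetical preset criterion
accepted = ratios[best] < criterion
```

For noisy data the same loop applies, but the criterion has to be loosened and the selected $F$ should be treated with correspondingly less confidence.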

V. NUMERICAL EXAMPLE 

Consider the data generated by a Logistic map $X_{n+1} = A X_n (1 - X_n)$ where $0 < X < 1$
and $A$ is an unknown parameter. We show in this section how $A$ could be retrieved from the
data. Let the data series sampled at a chosen time delay $\tau$ be $\{X_n\} = \{X_1, X_2, \ldots, X_N\}$.
The data can be embedded in three dimensions using embedding vectors as shown in the
following matrix:

$$E = \begin{pmatrix}
        X_1 & X_2 & X_3 \\
        X_2 & X_3 & X_4 \\
        \vdots & \vdots & \vdots \\
        X_5 & X_6 & X_7
      \end{pmatrix}.$$

The dimension of the $E$ matrix is $5 \times 3$. After the SVD operation on the embedding matrix $E$ we
observed that none of the singular values of $E$ go to zero, indicating that no linear dependence is
present in the data. Now the $E$ matrix is extended to the $F$ matrix as follows,

$$F = [E : f_1]
    = [E^{<1>}\ E^{<2>}\ E^{<3>}\ f_1]
    = [E^{<1>}\ E^{<2>}\ E^{<3>}\ (E^{<1>})^2],$$

so that a typical row of $F$ is $[X_p\ X_{p+1}\ X_{p+2}\ X_p^2]$ where $p$ goes from 1 to 5.
SVD of $F$ gives a zero singular value, $W_{4,4} = 0$, indicating a linear dependence among the
columns $X_n$, $X_{n+1}$ and the added nonlinear column $X_n^2$. Eq. 8 for this particular numerical
example is

$$V_{1,4} X_n + V_{2,4} X_{n+1} + V_{3,4} X_{n+2} + V_{4,4} X_n^2 = 0. \tag{10}$$

Substituting the numerical values, the exact equation can be recovered from Eq. 10 as
follows.

$$-0.696311 X_n + 0.174078 X_{n+1} + 0 X_{n+2} + 0.696311 X_n^2 = 0,$$

$$4 X_n - X_{n+1} - 4 X_n^2 = 0,$$

$$4 X_n (1 - X_n) = X_{n+1}.$$

For the numerical example, the initial condition that generated the trajectory was 0.02 and
the selected parameter value $A$ of the logistic map was 4. We have observed that the method
of nonlinear SVD works reasonably well even in the presence of noise, provided the noise
is below some threshold value. Let the data $\{X_n\}$ be contaminated by additive noise $\{P_n\}$
which is either gaussian or uniform, $\{\tilde{X}_n\} = \{X_n\} + \{P_n\}$. The same procedure can be carried out
on the embedding matrix $F$ created from the noisy data $\{\tilde{X}_n\}$ to extract the nonlinearity.
TABLE I shows the estimated values of the parameter $A$ for the data generated by the Logistic
family of maps in the presence of different types of noise. The preset criterion for
the noisy case was $10^{-?}$. For recovering the parameters, we made an assumption that the
singular values smaller than this can be considered zero and the underlying equation is
extracted as explained in the noise-free case.
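The noise-free example can be reproduced with a few lines of code. This is a minimal sketch, assuming NumPy, the initial condition 0.02 and $A = 4$ quoted above; the variable names and the choice to normalize the coefficient of $X_{n+1}$ to $-1$ are assumptions made for readability.

```python
import numpy as np

# Seven iterates of the Logistic map with A = 4, starting from 0.02.
x = np.empty(7)
x[0] = 0.02
for i in range(6):
    x[i + 1] = 4.0 * x[i] * (1.0 - x[i])

# 5 x 3 embedding matrix E with rows (X_p, X_{p+1}, X_{p+2}).
E = np.array([x[p:p + 3] for p in range(5)])
# Extended matrix F = [E : (E^{<1>})^2], with rows (X_p, X_{p+1}, X_{p+2}, X_p^2).
F = np.column_stack([E, E[:, 0] ** 2])

_, w, Vt = np.linalg.svd(F, full_matrices=False)
v = Vt[-1]                 # last column of V, belonging to W_44

# Scale the relation so that the coefficient of X_{n+1} is -1.
coef = v / (-v[1])         # expected to be close to (4, -1, 0, -4)
A_est = coef[0]
```

The sign of `v` returned by the SVD routine is arbitrary, which is why the relation is rescaled before the parameter is read off.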

VI. COMPARISON OF NON-LINEAR SVD WITH STANDARD SVD AND SURROGATES

Fig. 1 shows the quadratically scaled singular value spectrum, $n^2 \sigma_n$ versus $n$, where $\sigma_n$
is the $n$-th singular value generated by the standard SVD on the embedding matrix $E$. The
time series $\{X\}$ generated from the Logistic map $X_{n+1} = A X_n (1 - X_n)$, where $0 < X_n < 1$
and $A = 4$, is used to create an embedding matrix $E$ with unity delay as explained in Section
III. The dimension of the embedding matrix is selected as $21 \times 21$ for this particular example.
The standard SVD operation gives 21 non-zero singular values. Note that the profile gradually
increases and slowly comes down. Similarly for the nonlinear SVD operation, the embedding
matrix $F$ is generated as explained in Section IV. The dimension of the embedding matrix
is kept the same as that of $E$, i.e. $21 \times 21$, of which the last 10 columns are squares of the first
10 columns. Now the SVD operation gives 21 singular values, out of which the last 10 are zero.
Observe that the nonlinear SVD profile is significantly different from that of the conventional
SVD case, as the former is 'flat' towards the end. In Fig. 1, the spectrum for nonlinear SVD
shows zero singular values, unlike the standard SVD. There is a significant qualitative
change in the spectrum: the profile of the nonlinear SVD case drops to zero rapidly compared
to the standard SVD case.

FIG. 1: Quadratically scaled spectrum of singular values for the case of standard SVD ($n^2 \sigma_n$) and
non-linear SVD ($n^2 \rho_n$).

Similar analysis is done for the surrogates. Surrogates are generated from the Fourier
Transform of the data by randomizing the phases. $K$ surrogate data series $\{X_{surr}(k)\}$
are generated from $\{X\}$ such that $\{X_{surr}(k)\}$ and $\{X\}$ have the same power spectrum
[30]. Hence $\{X_{surr}(k)\}$ is considered as the nondeterministic counterpart of $\{X\}$. Fig.
2 (i) and (ii) show the quadratically scaled spectra of singular values of the data and its
surrogates for the standard SVD and nonlinear SVD respectively under noise free conditions.
We observe that the selection of nonlinear columns has worked since the nonlinear SVD has
given identically zero singular values as shown in (ii) of Fig. 2. Moreover the method is able
to clearly distinguish between the data and the surrogates. The figures also contain the
spectrum for the original data $\{X\}$ for a comparison with its surrogates. Similar analysis
for noisy data is included in Section VIII.

FIG. 2: (i) Quadratically scaled spectrum of singular values by standard SVD on the data ($n^2 \sigma_n$)
and its surrogates ($n^2 \sigma_{S1n}$) and ($n^2 \sigma_{S2n}$). (ii) Similar spectrum by nonlinear SVD for the data
($n^2 \rho_n$) and its surrogates ($n^2 \rho_{S1n}$) and ($n^2 \rho_{S2n}$).
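The surrogate comparison can be sketched as follows. This is a minimal sketch, assuming NumPy; the function names, the random seed and the use of 40 rows (rather than the $21 \times 21$ matrix of the text, to keep the surrogate matrix comfortably overdetermined) are assumptions. The surrogate is obtained by adding random phases to the Fourier coefficients, which leaves the power spectrum unchanged.

```python
import numpy as np

def phase_randomized_surrogate(x, rng):
    """Surrogate with the same power spectrum as x but randomized Fourier phases."""
    x = np.asarray(x, dtype=float)
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spectrum.shape)
    phases[0] = 0.0                 # leave the zero-frequency (mean) term untouched
    if len(x) % 2 == 0:
        phases[-1] = 0.0            # the Nyquist term must stay real for even length
    return np.fft.irfft(spectrum * np.exp(1j * phases), n=len(x))

def scaled_spectrum(F):
    """Quadratically scaled singular values n^2 * (n-th singular value)."""
    w = np.linalg.svd(F, compute_uv=False)
    n = np.arange(1, len(w) + 1)
    return n ** 2 * w

def nonlinear_embedding(x, rows=40, cols=11):
    """F = [E : squares of the first cols-1 columns of E], unit delay."""
    E = np.array([x[i:i + cols] for i in range(rows)])
    return np.column_stack([E, E[:, :cols - 1] ** 2])

# Logistic data with A = 4.
x = np.empty(64)
x[0] = 0.02
for i in range(63):
    x[i + 1] = 4.0 * x[i] * (1.0 - x[i])

rng = np.random.default_rng(1)
surrogate = phase_randomized_surrogate(x, rng)

data_spec = scaled_spectrum(nonlinear_embedding(x))
surr_spec = scaled_spectrum(nonlinear_embedding(surrogate))
```

With these choices the last ten entries of `data_spec` are at round-off level, while the surrogate spectrum does not drop to zero.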

We have also varied the number and types of the nonlinear terms present in
the embedding matrix $F$. For the data from the quadratic map, we added cubic
columns to the $F$ matrix instead of quadratic columns. The singular value spectrum looks
qualitatively similar to the quadratic case, as shown in Fig. 3. But the exact relationship
cannot be retrieved quantitatively, as none of the singular values go to zero. We need to try
different embedding matrices and select the one which gives at least one nearly zero singular
value.

VII. RECOVERING NON-LINEARITY: HIGHER ORDER MAPS 

Let us now discuss the problem of recovering the non-linear equation when the data is
generated from a higher order map of the following form,

$$X_{n+2} = f(X_n, X_{n+1}). \tag{11}$$

The well known Henon map falls in this category.

$$X_{n+1} = c - a X_n^2 + Y_n. \tag{12}$$

$$Y_{n+1} = b X_n. \tag{13}$$
FIG. 3: Quadratically scaled singular value spectra by nonlinear SVD for different choices of F
matrices on data generated by the Logistic map: (i) $n^2 \rho_n$ versus $n$, where $\rho_n$ is the $n$-th singular value
of F with quadratic columns, and (ii) $n^2 \phi_n$ versus $n$, where $\phi_n$ is the $n$-th singular value of F with
cubic columns.

We generated some data using this map and later assumed that only the $X$ data is available.
In that case it is more convenient to write the map in the form of Eq. 11. With that form in mind we
set up $F$ to be $[\,1\ \ X_{n+2}\ \ X_{n+1}^2\ \ X_n\ \ (X_{n+1} X_n)\,]$. Performing SVD on the embedding
matrix $F$ we get the following relationship between the iterates.

$$0 = 0.496904 - 0.6956656 X_{n+1}^2 + 0.1490712 X_n - 0.496904 X_{n+2},$$

$$0 = 1 - 1.4 X_{n+1}^2 + 0.3 X_n - X_{n+2}.$$

Therefore,

$$X_{n+2} = 1 - 1.4 X_{n+1}^2 + 0.3 X_n. \tag{14}$$

This can be seen as equivalent to Eq. 12 and 13 with parameter values $a = 1.4$, $b = 0.3$ and
$c = 1$. Thus using the proposed method the parameters are retrieved from the $X$ data.
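The Henon example can be sketched numerically as follows. This is a minimal sketch, assuming NumPy; the initial condition $(0, 0)$, the series length and the normalization of the $X_{n+2}$ coefficient to $-1$ are assumptions, and the column order $[1,\ X_{n+2},\ X_{n+1}^2,\ X_n,\ X_{n+1} X_n]$ follows the relationship quoted above.

```python
import numpy as np

a, b, c = 1.4, 0.3, 1.0          # parameter values quoted in the text
n = 60                           # hypothetical series length

X = np.empty(n)
Y = np.empty(n)
X[0], Y[0] = 0.0, 0.0            # hypothetical initial condition
for i in range(n - 1):
    X[i + 1] = c - a * X[i] ** 2 + Y[i]
    Y[i + 1] = b * X[i]

# Only the X data is used from here on.
Xn, Xn1, Xn2 = X[:-2], X[1:-1], X[2:]
F = np.column_stack([np.ones_like(Xn), Xn2, Xn1 ** 2, Xn, Xn1 * Xn])

_, w, Vt = np.linalg.svd(F, full_matrices=False)
# Scale the null relation so that the coefficient of X_{n+2} is -1:
# c - X_{n+2} - a X_{n+1}^2 + b X_n + 0 (X_{n+1} X_n) = 0.
coef = Vt[-1] / (-Vt[-1][1])
c_est, a_est, b_est = coef[0], -coef[2], coef[3]
```

The coefficient of the product column comes out at round-off level, which is how the method signals that this candidate term is absent from the map.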

Similarly, consider the case of the Logistic map again. We are once again required to find
the parameter $A$, but all the odd iterates of the time series are suppressed. In this case we
could consider the map of the form

$$X_{n+2} = g(X_n). \tag{15}$$

The non-linear SVD on the modified data can retrieve a quartic nonlinearity. The predicted
equation for this particular example is of the form $g(x) = Bx + Cx^2 + Dx^3 + Ex^4$,
where $B$, $C$, $D$, $E$ are functions of the parameter $A$. SVD on $F = [\,X_n\ \ X_{n+2}\ \ X_n^2\ \ X_n^3\ \ X_n^4\,]$
gives a zero singular value corresponding to the following relationship between the columns.

$$0 = 14.7609 X_n - X_{n+2} - 71.4725 X_n^2 + 113.4232 X_n^3 - 56.7116 X_n^4.$$

Therefore,

$$X_{n+2} = 14.7609 X_n - 71.4725 X_n^2 + 113.4232 X_n^3 - 56.7116 X_n^4. \tag{16}$$

But for the Logistic map $X_{n+1} = A X_n (1 - X_n)$. Hence the second iterate can be written as
a function of its first iterate as follows,

$$X_{n+2} = (A^2) X_n - (A^2 + A^3) X_n^2 + (2A^3) X_n^3 - (A^3) X_n^4. \tag{17}$$

Comparing Eq. 16 and 17 we recover the value of the parameter $A = 3.842$.
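The step from the Logistic map to Eq. 17 is not spelled out in the text; the following worked expansion is one way to fill it in, followed by a check of the quoted coefficients against $A = 3.842$.

```latex
\begin{align*}
X_{n+2} &= A X_{n+1}\,(1 - X_{n+1}) \\
        &= A\,\bigl[A X_n (1 - X_n)\bigr]\,\bigl[1 - A X_n (1 - X_n)\bigr] \\
        &= A^2 X_n (1 - X_n)\,\bigl(1 - A X_n + A X_n^2\bigr) \\
        &= A^2 X_n \bigl(1 - (1 + A) X_n + 2A X_n^2 - A X_n^3\bigr) \\
        &= A^2 X_n - (A^2 + A^3) X_n^2 + 2A^3 X_n^3 - A^3 X_n^4 .
\end{align*}
% Matching with Eq. 16 for A = 3.842:
%   A^2       = 14.7609
%   A^3       = 56.7116
%   A^2 + A^3 = 71.4725
%   2 A^3     = 113.4232
```

All four coefficients of Eq. 16 are reproduced by the single value $A = 3.842$, which serves as a consistency check on the recovered quartic.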

VIII. NUMERICAL RESULTS FOR NOISY DATA 

Fig. 4 (c) and (d) show the state space created from the noisy data for both the uniform
and gaussian noises. Conventional SVD on the noisy data gives the value of $W_{p,p}/W_{1,1}$
somewhere in the range $(10^{-?}, 10^{-?})$ for different noise levels of the data from the Logistic and
Henon maps shown in TABLE I and II. For nonlinear SVD, we have set the value of the criterion
as $10^{-?}$. If the ratio $W_{p,p}/W_{1,1}$ is below the preset value $10^{-?}$, the parameters are retrieved
from the data. As the noise increases the method almost breaks down and even fails to
distinguish between the data and its surrogates.

We found that when the noise level was low, the non-linear SVD was able to recover
the non-linearity. TABLE I shows the estimated values of the parameters $a_1$, $a_2$ for the data
generated by the Logistic family of maps, $X_{n+1} = a_1 X_n - a_2 X_n^2$ where $0 < X_n < 1$, for the
values $a_1 = 4$ and $a_2 = 4$, for different Peak Signal to Noise Ratio (PSNR) using the nonlinear
SVD algorithm. PSNR is defined as $20 \log\{\max(\mathrm{Signal})/\sqrt{\mathrm{MSE}}\}$ where MSE is the Mean
Squared Error, the average of the square of the noise added to the signal. The proposed
method works well when the PSNR is above 28 for Gaussian noise and 33 for uniform noise.
It seems that the method breaks down as the noise content in the signal increases (i.e. for
lower PSNR values). Similarly TABLE II shows the estimated values $\hat{a}$, $\hat{b}$, $\hat{c}$ of the parameters for
the Henon map $X_{n+1} = c - a X_n^2 + Y_n$, $Y_{n+1} = b X_n$ in the presence of noise using nonlinear
SVD. Again the method breaks down for higher noise contents (PSNR < 30) as we saw in
the Logistic case.

FIG. 4: Time series (a) the chaotic data generated from the Logistic map $X \mapsto 4X(1 - X)$ where
$0 < X < 1$ (b) The phase space. (c) Phase space for noisy data (added uniform noise N(0,1) with
noise level 28.089%) (d) Phase space for noisy data (added gaussian noise N(0,1) with noise level
28.123%). Noise level is defined as the ratio of the maximum noise value to the maximum signal
value.
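The PSNR definition above can be written as a small helper. This is a minimal sketch, assuming NumPy and a base-10 logarithm (the text writes only "log"); the function name `psnr`, the noise scale 0.01 and the random seed are hypothetical.

```python
import numpy as np

def psnr(signal, noise):
    """20 log10( max(signal) / sqrt(MSE) ), with MSE the mean squared noise."""
    mse = np.mean(np.square(noise))
    return 20.0 * np.log10(np.max(signal) / np.sqrt(mse))

# Hand-checkable case: max(signal) = 1 and MSE = 1e-4 give 20 log10(100) = 40.
check = psnr(np.array([0.5, 1.0]), np.array([0.01, -0.01]))

# Logistic data (a1 = a2 = 4) with additive uniform noise of a hypothetical scale.
x = np.empty(200)
x[0] = 0.02
for i in range(199):
    x[i + 1] = 4.0 * x[i] - 4.0 * x[i] ** 2
rng = np.random.default_rng(0)
noise = 0.01 * rng.uniform(0.0, 1.0, size=x.shape)
noisy = x + noise
level = psnr(x, noise)
```

Under this definition a larger value means cleaner data, which matches the observation that the estimates degrade as the PSNR falls.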
TABLE I: Estimated values $\hat{a}_1$, $\hat{a}_2$ for the Logistic map $X_{n+1} = a_1 X_n - a_2 X_n^2$, where
$0 < X_n < 1$, for the parameter values $a_1 = 4$ and $a_2 = 4$ in the presence of noise using nonlinear
SVD. Peak Signal to Noise Ratio (PSNR) is defined as $20 \log\{\max(\mathrm{Signal})/\sqrt{\mathrm{MSE}}\}$.

Parameters: $a_1 = 4$, $a_2 = 4$

    Noise (Uniform Distribution)            Noise (Gaussian Distribution)
    \hat{a}_1   \hat{a}_2   PSNR            \hat{a}_1   \hat{a}_2   PSNR
    4.0367      3.9941      44.2314         4.099       4.0879      43.4502
    4.1427      4.0206      38.7308         4.106       4.0919      41.8398
    3.8757      3.8473      37.7963         4.247       4.1568      37.095
    4.2358      4.1516      36.4626         4.299       4.1484      34.5231
    4.3895      4.2506      33.6095         4.1563      3.9952      32.8561
    4.4809      4.2783      30.7117         4.0301      3.824       28.8170
    4.429       3.8507      24.6316         3.8041      2.5627      24.151

TABLE II: Estimated values of parameters $a$, $b$, $c$ for the Henon map
$X_{n+1} = c - a X_n^2 + Y_n$; $Y_{n+1} = b X_n$ in the presence of noise using nonlinear SVD.

Parameters: $a = 1.4$, $b = 0.3$, $c = 1.0$

    Noise (Uniform Distribution)                 Noise (Gaussian Distribution)
    \hat{a}   \hat{b}   \hat{c}   PSNR           \hat{a}   \hat{b}   \hat{c}   PSNR
    1.389     0.2896    1.003     46.863         1.4001    0.2998    1.003     44.5158
    1.3812    0.2923    1.010     40.764         1.3983    0.2964    1.005     40.93
    1.357     0.2769    1.009     36.178         1.4       0.3167    0.9881    35.93
    1.3717    0.262     1.0204    32.975         1.4376    0.3258    1.008     34.113
    1.3665    0.2758    1.067     30.4932        1.396     0.3279    1.04      31.336
    1.2969    0.2680    1.0349    26.549         1.5206    0.3401    1.08      28.3245
    1.3268    0.1566    1.1489    20.8353        1.2512    0.2917    0.9547    22.714

Fig. 5 shows the quadratically scaled singular spectra by standard SVD on the logistic
data ($n^2 \sigma_n$) and its surrogates ($n^2 \sigma_{S1n}$) and ($n^2 \sigma_{S2n}$) along with the similar spectrum by
nonlinear SVD for the data ($n^2 \rho_n$) and its surrogates ($n^2 \rho_{S1n}$) and ($n^2 \rho_{S2n}$). Fig. 5 (i)
and (ii) show the case of gaussian noise N(0,1) with noise level 28.123% in the data. Fig.
5 (iii) and (iv) show noise added from the uniform distribution (0, 1) with noise level 28.089%.
Noise level is defined as the ratio of the maximum noise value to the maximum signal value.
The size of the embedding matrix is kept constant for all the data and surrogate matrices
for both the standard and nonlinear SVD operation. The surrogate data spectrum is added
for the comparative analysis. It is clear that nonlinear SVD distinguishes the original data
from its surrogates even in the presence of noise, provided the noise level is low.

FIG. 5: Quadratically scaled spectrum of singular values by standard SVD on the Logistic data
($n^2 \sigma_n$) and its surrogates ($n^2 \sigma_{S1n}$) and ($n^2 \sigma_{S2n}$) along with the similar spectrum by nonlinear SVD
for the data ($n^2 \rho_n$) and its surrogates ($n^2 \rho_{S1n}$) and ($n^2 \rho_{S2n}$). (i) and (ii): gaussian
noise N(0,1), added noise level 27.484%. (iii) and (iv): uniform noise on [0,1), added
noise level 28.182%.

As we discussed in Section I, if the goal is to detect the nonlinearity but not to determine
the underlying model, one could use the method suggested by A. Porta et al. In ref. [16] the
case of the logistic model itself is discussed with parameter value $A = 3.7$ under various noise
levels. They have shown that an error function dips much further with actual data than
with its surrogates. A comparison of this method with various other methods is discussed
in ref. [17].

FIG. 6: (i) Phase space of the original data (X, V) generated by Team A; the highlighted portion
shows the section of a short data segment of length 1000, B, which was sent to team B. (ii) The
temporal signal B.

IX. GENERALIZATION TO DIFFERENTIAL EQUATIONS 

The same technique can be applied if the data is generated by a set of nonlinear differential
equations. Here the goal is to identify a specific differential equation from the time series
based on two assumptions: (i) the sampling is frequent and (ii) the noise level is very low. We
demonstrate here how this could be done by a combination of nonlinear SVD and the
method of finding accurate derivatives from data [26]. Consider the following narrative.
Two teams A and B are using a communication channel for sending information to each
other. The sender, A, is generating data from a differential equation of the form

$$\frac{d}{dt} X_1 = X_2. \tag{18}$$

$$\frac{d}{dt} X_2 = G(X_1, X_2). \tag{19}$$

A time series of length 50,000 was generated using the Runge-Kutta method from an initial
condition $(1, -0.5)$ with sampling step size 0.001, from which a portion of $X_1$ of length
1000 was taken and sent to team B. A hint has been given that $G$ is a relatively simple
multinomial function.

We illustrate the method and the issues involved by showing how team B works out the
unknown differential equation from a short time series and the sampling interval. Team
B is left with a small time series $Y(t) = X_1$ (accurate up to $10^{-??}$) shown in Fig. 6 (ii)
along with the information that the step size was 0.001. B calculates $n$ derivatives of $Y(t)$,
$(Y_2, Y_3, \ldots, Y_{n+1})$, from $Y(t)$ as described in [26]. $Y_n$ is the $(n-1)$-th derivative of $Y(t)$. $Y(t)$
is denoted as $Y_1$ (the 0-th derivative of $Y(t)$) for the rest of the section. As the first step
B assumes that $G$ is a nonlinear function consisting of linear and quadratic terms of $Y_1$ and
$Y_2$ as shown below.

$$G = c_1 Y_1 + c_2 Y_2 + c_3 Y_1^2 + c_4 Y_1 Y_2 + c_5 Y_2^2.$$

Hence the columns of the embedding matrix were chosen as $[\,Y_3\ \ Y_1\ \ Y_2\ \ Y_1^2\ \ Y_2^2\ \ (Y_1 Y_2)\,]$ and
SVD is performed to get the singular values (201.33, 162.24, 83.95, 3.95, 0.44, 0.03). Since
none of these is close to zero even though the noise level is quite low, B decides to go to the
next step of adding cubic terms to the embedding matrix. Now the assumed nonlinear function is

$$G = c_1 Y_1 + c_2 Y_2 + c_3 Y_1^2 + c_4 Y_2^2 + c_5 Y_1 Y_2 + c_6 Y_1^3 + c_7 Y_2^3 + c_8 (Y_1^2 Y_2) + c_9 (Y_2^2 Y_1),$$

and the corresponding embedding matrix is

$$F = [\,Y_3\ \ Y_1\ \ Y_2\ \ Y_1^2\ \ Y_2^2\ \ (Y_1 Y_2)\ \ Y_1^3\ \ Y_2^3\ \ (Y_1^2 Y_2)\ \ (Y_2^2 Y_1)\,].$$

Singular values of $F$ are (181.0635, 115.3297, 44.2941, 2.1510, 22.1215, 0.1768, 0.0247,
0.0044, 0.0003, $2.995 \times 10^{-??}$). Since the last singular value is very small (of the order of
$10^{-??}$), B assumes it to be zero and tries to recover the linear relationship corresponding to
that singular value. Hence the retrieved coefficient array is

$$C = [\,1,\ 2.895,\ -0.237,\ 0,\ 0,\ 0,\ 0,\ 0,\ 0.237,\ 0\,]^T.$$

Now team B has recovered the following equation,

$$0 = Y_3 - 0.237 Y_2 + 0.237 Y_2 Y_1^2 + 2.895 Y_1,$$

$$Y_3 = 0.237 Y_2 - 0.237 Y_2 Y_1^2 - 2.895 Y_1,$$

$$Y_3 = 0.237 Y_2 (1 - Y_1^2) - 2.895 Y_1.$$

$Y_2$ and $Y_3$ are the first and second derivatives of $Y_1$. B has calculated 10 derivatives for this
particular numerical example. When Teams A and B get together B finds that A has used the
Van der Pol equation of the following form with parameter values $k = 2.895$ and $c = 0.237$.

$$\frac{d}{dt} X_2 = c X_2 (1 - X_1^2) - k X_1.$$

Team B's estimated values are $k = 2.895$ and $c = 0.237$, which are in agreement with the
values used by A. The form of the equation and the parameter values are exactly predicted
by B using the proposed method. Fig. 6 shows the phase space of the Van der Pol oscillator
and the temporal signal which was sent to Team B.
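The workflow of team B can be sketched in code under some simplifying assumptions. This is not the procedure of [26]: the derivatives are approximated here by plain central finite differences (`numpy.gradient`), a longer segment of 20,000 samples is used, and only a reduced set of candidate columns $[Y_3,\ Y_1,\ Y_2,\ Y_1^2 Y_2]$ is tried, because finite-difference errors would blur the smallest singular values of the full ten-column matrix. The integrator, function names and trimming of edge samples are likewise assumptions.

```python
import numpy as np

K_TRUE, C_TRUE = 2.895, 0.237    # parameter values quoted in the text
H = 0.001                        # sampling step quoted in the text

def van_der_pol(state):
    """Right-hand side of dX1/dt = X2, dX2/dt = c X2 (1 - X1^2) - k X1."""
    x1, x2 = state
    return np.array([x2, C_TRUE * x2 * (1.0 - x1 ** 2) - K_TRUE * x1])

def rk4(f, state, h, steps):
    """Classical fourth-order Runge-Kutta integration with fixed step h."""
    out = np.empty((steps + 1, len(state)))
    out[0] = state
    for i in range(steps):
        s = out[i]
        k1 = f(s)
        k2 = f(s + 0.5 * h * k1)
        k3 = f(s + 0.5 * h * k2)
        k4 = f(s + h * k3)
        out[i + 1] = s + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return out

traj = rk4(van_der_pol, np.array([1.0, -0.5]), H, 20000)
y1 = traj[:, 0]                  # only X1 is available to the receiver

# Assumption: central finite differences stand in for the derivative method of [26].
y2 = np.gradient(y1, H)          # first derivative
y3 = np.gradient(y2, H)          # second derivative
inner = slice(5, -5)             # drop edge samples where the differences are one-sided
Y1, Y2, Y3 = y1[inner], y2[inner], y3[inner]

# Reduced candidate matrix; the relation sought is
# Y3 + k Y1 - c Y2 + c Y1^2 Y2 = 0.
F = np.column_stack([Y3, Y1, Y2, Y1 ** 2 * Y2])
_, w, Vt = np.linalg.svd(F, full_matrices=False)
coef = Vt[-1] / Vt[-1][0]        # scale so that the coefficient of Y3 is 1
k_est, c_est = coef[1], coef[3]
```

With a more accurate derivative estimate, further candidate columns can be added in the same way and their coefficients are expected to come out close to zero.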



X. CHAOTIC DATA GENERATED BY DIFFERENTIAL EQUATIONS

The same technique works for the chaotic data generated by a set of nonlinear differential 
equations. Here we explain how to identify the specific differential equation along with the 
parameters with the help of nonlinear SVD and the method of finding accurate derivative 



from data [26| as explained in previous section. Consider the data generated by from the 



duffing equation of the form. 

= A-,. (20) 

= -kXi - cX2 - SXi^ + A.cos{ujt). (21) 

Let the parameter values be: k = 0.01, c = 0.04496, S = 1, A = 1.02 and cu = 0.44964. 
Parameters are selected such that the system exhibit chaos. We ensured adherence to the 
conditions (i) the sampling was frequent and (ii) noise level was very low. The time series 
of short length Y{t) = Xi was sent to the receiver. The temporal signal Y{t) and phase 
space of the duffing oscillator is displayed in Fig. 6. The receiver is informed that the data 
is generated using an equation of the form, 

$$\frac{d}{dt} X_1 = X_2, \tag{22}$$

$$\frac{d}{dt} X_2 = G(X_1, X_2) + A\cos(\omega t), \tag{23}$$

where $G$ is a low-order polynomial function. Two additional pieces of information were
given to the receiver: (i) the sampling step size $h = 0.01$ and (ii) the forcing frequency
$\omega = 0.44964$. The parameters and the exact form of the equation can be retrieved from
the data as shown below.

FIG. 7: Phase space of the original data (X, V) generated by the Duffing oscillator under chaos; the
highlighted portion shows the short data segment of length 1000 which was used to
identify the system. The temporal signal B is shown on the right side.

Once again $n$ derivatives $(Y_2, Y_3, \ldots, Y_{n+1})$ are calculated. We now assume that $G$ is a
nonlinear function consisting of linear, quadratic and cubic terms of $Y_1$ and $Y_2$, along with
the sinusoidal functions:

$$G = c_{10} Y_1 + c_{01} Y_2 + c_{20} Y_1^2 + c_{11} Y_1 Y_2 + c_{02} Y_2^2 + c_{30} Y_1^3 + c_{03} Y_2^3 + c_{21} Y_1^2 Y_2 + c_{12} Y_2^2 Y_1 + P\sin(\omega t) + Q\cos(\omega t).$$

The corresponding embedding matrix is

$$F = [\,Y_3 \;\; Y_1 \;\; Y_2 \;\; Y_1^2 \;\; Y_2^2 \;\; (Y_1 Y_2) \;\; Y_1^3 \;\; Y_2^3 \;\; (Y_1^2 Y_2) \;\; (Y_2^2 Y_1) \;\; \sin(\omega t) \;\; \cos(\omega t)\,].$$

The SVD of $F$ gives the singular values (46.1294, 28.0900, 20.9784, 18.8628, 12.5940, 6.8899,
4.8714, 4.2835, 2.0742, 0.8881, 0.5682, 0). Since the last singular value is zero, we can
recover the linear relationship corresponding to that singular value. The coefficient array $C$
is

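$$C = [\,1,\ 0.01,\ 0.04496,\ 0,\ 0,\ 0,\ 1,\ 0,\ 0,\ 0,\ -0.98958,\ 0.24723\,]^{T}.$$

The following equation is retrieved from $C$. The sine and cosine terms are combined into a
single sinusoid, with the phase offset absorbed into the time origin:

$$Y_3 + 0.01\,Y_1 + 0.04496\,Y_2 + 1 \cdot Y_1^3 - \sqrt{0.98958^2 + 0.24723^2}\,\cos(\omega t) = 0,$$

$$Y_3 + 0.01\,Y_1 + 0.04496\,Y_2 + 1 \cdot Y_1^3 - 1.02\cos(\omega t) = 0.$$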

The estimated parameter values and the form of the equation are in exact agreement with
the Duffing equation from which the data was generated.
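
The forced case can be sketched in the same way, with sine and cosine columns appended to the polynomial terms. The Python below is an illustration under stated assumptions, not the authors' code: the integrator, the finite-difference derivatives, the initial state, the discarded transient and the segment length (3001 samples rather than the 1000 used in the paper) are all hypothetical choices. The receiver's clock is assumed to start at the first received sample, which is why both a sine and a cosine column are needed.

```python
import numpy as np

# Parameter values quoted in the text; H is the sampling step given to the receiver.
K, C_DAMP, DELTA, AMP, OMEGA, H = 0.01, 0.04496, 1.0, 1.02, 0.44964, 0.01

def duffing_rhs(t, s):
    """Forced Duffing system of Eqs. (20)-(21)."""
    x1, x2 = s
    return np.array([x2, -K * x1 - C_DAMP * x2 - DELTA * x1**3
                     + AMP * np.cos(OMEGA * t)])

def rk4(rhs, s0, h, n_steps):
    """Classical fourth-order Runge-Kutta for a time-dependent system."""
    out = np.empty((n_steps + 1, 2))
    out[0] = s0
    for i in range(n_steps):
        t, s = i * h, out[i]
        f1 = rhs(t, s)
        f2 = rhs(t + 0.5 * h, s + 0.5 * h * f1)
        f3 = rhs(t + 0.5 * h, s + 0.5 * h * f2)
        f4 = rhs(t + h, s + h * f3)
        out[i + 1] = s + (h / 6.0) * (f1 + 2 * f2 + 2 * f3 + f4)
    return out

def derivatives(y, h):
    """Five-point central estimates of y' and y'' on the interior samples."""
    d1 = (-y[4:] + 8 * y[3:-1] - 8 * y[1:-3] + y[:-4]) / (12 * h)
    d2 = (-y[4:] + 16 * y[3:-1] - 30 * y[2:-2] + 16 * y[1:-3] - y[:-4]) / (12 * h**2)
    return y[2:-2], d1, d2

# Sender: integrate, drop an assumed transient, transmit X1 only.
traj = rk4(duffing_rhs, np.array([1.0, 0.0]), H, 8000)
signal = traj[5000:, 0]

# Receiver: derivatives from the data, time measured from the first sample.
y1, y2, y3 = derivatives(signal, H)
t = H * np.arange(len(signal))[2:-2]

# Columns in the order Y3, Y1, Y2, Y1^2, Y2^2, Y1*Y2, Y1^3, Y2^3,
# Y1^2*Y2, Y2^2*Y1, sin(wt), cos(wt).
F = np.column_stack([y3, y1, y2, y1**2, y2**2, y1 * y2, y1**3, y2**3,
                     y1**2 * y2, y2**2 * y1,
                     np.sin(OMEGA * t), np.cos(OMEGA * t)])

_, s, vt = np.linalg.svd(F, full_matrices=False)
C = vt[-1] / vt[-1][0]          # scale so the coefficient of Y3 is 1

k_est, c_est, cubic_est = C[1], C[2], C[6]
amp_est = np.hypot(C[10], C[11])   # amplitude of the combined sinusoid

print("k, c, cubic coefficient:", k_est, c_est, cubic_est)
print("forcing amplitude:", amp_est)
```

The individual sine and cosine coefficients depend on where the received segment starts, so they will generally differ from the values quoted in the text. Only their combined amplitude is expected to match the forcing amplitude.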



XI. LINEAR VERSUS NONLINEAR MODELS 



A second-order linear system $X_n = a X_{n-1} + b X_{n-2}$ was simulated with parameter values
$a = -1.5$ and $b = -1$. In the absence of noise, the singular values of the linear SVD fell off
more sharply than those of the nonlinear SVD. For the surrogate data, neither the linear SVD
nor the nonlinear SVD had a sharp fall-off. In such cases the linear model would be chosen
on the grounds of parsimony alone. The same situation continues in the presence of a small
amount of noise. However, when the noise exceeds a certain level, neither the linear nor the
nonlinear SVD differs enough, qualitatively, from that of the surrogate data to give any degree
of confidence in either of the models.
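
The noise-free case can be tried out with a short Python sketch. The initial values, the series length and the choice of quadratic terms for the augmented matrix are assumptions made here for illustration. The surrogate and noisy cases are not covered.

```python
import numpy as np

a, b, n = -1.5, -1.0, 500

# Simulate X_n = a*X_{n-1} + b*X_{n-2}; the starting values are arbitrary.
x = np.empty(n)
x[0], x[1] = 1.0, 0.5
for i in range(2, n):
    x[i] = a * x[i - 1] + b * x[i - 2]

x_n, x_1, x_2 = x[2:], x[1:-1], x[:-2]   # X_n, X_{n-1}, X_{n-2}

# Linear embedding and an augmented (nonlinear) embedding with quadratic terms.
lin = np.column_stack([x_n, x_1, x_2])
aug = np.column_stack([x_n, x_1, x_2, x_1**2, x_1 * x_2, x_2**2])

_, s_lin, vt_lin = np.linalg.svd(lin, full_matrices=False)
_, s_aug, vt_aug = np.linalg.svd(aug, full_matrices=False)

# Null vectors scaled so the coefficient of X_n is 1; the relation is
# X_n - a*X_{n-1} - b*X_{n-2} = 0.
c_lin = vt_lin[-1] / vt_lin[-1][0]
c_aug = vt_aug[-1] / vt_aug[-1][0]
a_est, b_est = -c_lin[1], -c_lin[2]

print("linear spectrum:   ", s_lin)
print("nonlinear spectrum:", s_aug)
print("a, b from linear SVD:", a_est, b_est)
print("quadratic coefficients from nonlinear SVD:", c_aug[3:])
```

In this setting both matrices have a last singular value at round-off level, and the augmented matrix assigns essentially zero weight to the quadratic columns, so it recovers the same linear relation with more terms.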



XII. APPLICATION TO CRYPTANALYSIS 



We could be interested in cryptanalysis [31], the study of code breaking, which involves
decrypting the encrypted data without access to the secret key. Suppose Alice and Bob
are communicating a secret across a private channel. The assumption is that the data can 
be read only by Bob, the intended recipient, and he has the key to decrypt the message. 
The aim of cryptanalysis is to decrypt the message without the key. The signal sent across 
the channel looks random but it is generated by a deterministic dynamical system. Since 
Bob has some information about the system parameters (secret key) he can retrieve the 
information from the encrypted signal. But the cryptanalyst, Eve, is left with a seemingly
random signal which is to be decrypted. All she knows about the signal is that it must have
been generated by a deterministic dynamical system, presumably a non-linear one. She has no
clue about the parameters or the dimension of the system. Trial and error is
extremely tedious and time-consuming, and has a low probability of success. Compared
to code-making, relatively little work has been done in cryptanalysis (hardly 1 in
100 papers proposes a new method of cryptanalysis). Here we propose a new method
based on SVD to find information about the non-linear system from the encrypted signal.
Nonlinear Singular Value Decomposition and time delayed embedding can be used to identify 
the nonlinearity of the system from the data. As discussed before, in cryptography
the signals to be decoded are mostly generated by non-linear systems, and complete data
is not always available. A method which can extract the nonlinearity from the encrypted
signal could be useful for applications in cryptanalysis. This method is potentially useful
for chaotic cryptography.
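
As a toy illustration of Eve's position, the Python sketch below treats a logistic-map sequence as the intercepted signal and recovers the map parameter from a one-step delay embedding. The parameter value `r_secret`, the initial value, the sequence length and the cubic candidate library are hypothetical. A real chaotic cipher would involve more than a bare map output.

```python
import numpy as np

# Hypothetical "secret key" and intercepted length, for illustration only.
r_secret, n = 3.9, 200

# Signal generated by the logistic map x_{n+1} = r * x_n * (1 - x_n).
x = np.empty(n)
x[0] = 0.3
for i in range(1, n):
    x[i] = r_secret * x[i - 1] * (1.0 - x[i - 1])

# Eve's delay embedding: next value, current value, and nonlinear columns
# derived from the current value (terms up to cubic are assumed).
nxt, cur = x[1:], x[:-1]
E = np.column_stack([nxt, cur, cur**2, cur**3])

_, s, vt = np.linalg.svd(E, full_matrices=False)

# Null vector scaled so the coefficient of x_{n+1} is 1; for the logistic
# map the relation is x_{n+1} - r*x_n + r*x_n^2 = 0.
C = vt[-1] / vt[-1][0]
r_est = -C[1]

print("singular values:", s)
print("coefficients [x_next, x, x^2, x^3]:", C)
print("recovered parameter:", r_est)
```

The near-zero last singular value signals that the candidate columns contain the generating relation, and the zero weight on the cubic column shows that the unnecessary term is rejected rather than fitted.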



XIII. CONCLUSION 



We have proposed a non-linear extension of the singular value decomposition (SVD) 
technique by means of appending additional columns to the trajectory matrix which are 
non-linearly derived from the existing columns. We propose nonlinear SVD as a method 
which is useful for the qualitative detection and quantitative determination of nonlinearity 
from a short time series. We have demonstrated the utility of non-linear SVD for recovering 
the non-linear relationship from time series generated by discrete and continuous dynamical 
systems. As examples, we have demonstrated the results for data from the Logistic map,
the Henon map, the Van der Pol oscillator and the Duffing oscillator. The proposed method
works quite well in the presence of noise, for both Gaussian and uniform noise (provided the
noise level is not high). In principle, the method can work with any type of non-linearity. The
paper contains a comparative analysis of the results for the data and its surrogates.